Properties, quantile regression, and application of bounded exponentiated Weibull distribution to COVID-19 data of mortality and survival rates

Well-known continuous distributions such as the Beta and Kumaraswamy distributions are useful for modeling data sets supported on the unit interval [0,1]. However, no single distribution suits every data set; the appropriate choice also depends on the shape of the data. In this research, a new three-parameter distribution, the bounded exponentiated Weibull (BEW) distribution, is defined to model data sets supported on the unit interval [0,1]. Some fundamental distributional properties of the BEW distribution are investigated. To model dependence between measures in a data set, a bivariate extension of the BEW distribution is developed and its graphical shapes are shown. Several estimation methods are discussed for the parameters of the BEW distribution, and a Monte Carlo simulation study is carried out to check the performance of the estimators. Afterward, applications of the BEW distribution are illustrated using COVID-19 data sets. The proposed distribution shows a better fit than many well-known distributions. Lastly, a quantile regression model based on the BEW distribution is developed, and graphical shapes of its probability density function (PDF) and hazard function are shown.


New proposed BEW distribution
Consider the CDF and the probability density function (PDF) of an exponentiated Weibull distribution. A distribution termed the BEWD is now developed through the transformation $Y = e^{-X}$, that is, $X = -\ln(Y)$, which yields the CDF of the BEWD. In that CDF, where y = ϒ, the third parameter is a location parameter, while α and β are respectively the shape and scale parameters.
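The transformation step can be illustrated numerically. The sketch below is not the paper's exact formula: it assumes the common exponentiated Weibull form $G(x) = [1 - e^{-(\lambda x)^{\beta}}]^{\alpha}$, with a hypothetical third parameter named `lam`, and applies $Y = e^{-X}$, which gives $P(Y \le y) = P(X \ge -\ln y) = 1 - G(-\ln y)$. The function names and parameter values are illustrative only.

```python
import math


def ew_cdf(x, alpha, beta, lam):
    """Exponentiated Weibull CDF G(x) for x >= 0 (assumed parameterisation)."""
    if x <= 0.0:
        return 0.0
    return (1.0 - math.exp(-((lam * x) ** beta))) ** alpha


def bew_cdf(y, alpha, beta, lam):
    """CDF of Y = exp(-X): P(Y <= y) = P(X >= -ln y) = 1 - G(-ln y)."""
    if y <= 0.0:
        return 0.0
    if y >= 1.0:
        return 1.0
    return 1.0 - ew_cdf(-math.log(y), alpha, beta, lam)


# Illustrative parameter values; `lam` is a hypothetical name for the third parameter.
for y in (0.1, 0.5, 0.9):
    print(y, round(bew_cdf(y, alpha=1.7, beta=0.5, lam=2.8), 4))
```

Under this assumed form, setting all three parameters to 1 reduces the distribution to the uniform law on (0, 1), which is a convenient sanity check.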

Reliability measures of BEW distribution
In this section a few reliability measures for the BEW distribution are discussed: the survival function, hazard function, reversed hazard function, cumulative hazard function, odds function, elasticity, and Mills ratio.
The survival function represents the probability that an individual survives beyond a certain time y. The hazard function characterizes the death rate of an individual at a specific age y. The reverse hazard function is the ratio of the probability density to the distribution function, $r_h(y) = F'(y)/F(y)$. The cumulative hazard function, the odds function, the elasticity, and the Mills ratio of the BEW distribution are obtained from the same CDF and PDF. Using these expressions, researchers can analyze or model data through the survival, hazard, and reverse hazard characteristics of the BEW distribution.
From Fig. 1, the PDF graphs of the BEW distribution show a variety of shapes: positively and negatively skewed, symmetrical, U-shaped, and reverse-J-shaped. The hazard function of the BEW distribution shows a bathtub shape.
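The reliability measures listed in this section are all simple functions of a CDF F and a PDF f. The sketch below uses their usual textbook definitions (survival 1 − F, hazard f/S, reverse hazard f/F, cumulative hazard −ln S, odds F/S, elasticity y·f/F, Mills ratio S/f). Because the BEW formulas themselves are not reproduced here, a Kumaraswamy distribution serves as a stand-in unit-interval model; the function names are hypothetical.

```python
import math


def reliability_measures(y, cdf, pdf):
    """Standard reliability measures at a point y where 0 < F(y) < 1."""
    F, f = cdf(y), pdf(y)
    S = 1.0 - F
    return {
        "survival": S,
        "hazard": f / S,
        "reverse_hazard": f / F,
        "cumulative_hazard": -math.log(S),
        "odds": F / S,
        "elasticity": y * f / F,
        "mills_ratio": S / f,
    }


# Stand-in unit-interval model (Kumaraswamy), used only to exercise the formulas.
def kuma_cdf(y, a, b):
    return 1.0 - (1.0 - y ** a) ** b


def kuma_pdf(y, a, b):
    return a * b * y ** (a - 1.0) * (1.0 - y ** a) ** (b - 1.0)


measures = reliability_measures(
    0.4,
    lambda y: kuma_cdf(y, 2.0, 3.0),
    lambda y: kuma_pdf(y, 2.0, 3.0),
)
for name, value in measures.items():
    print(f"{name}: {value:.4f}")
```

Any other pair of CDF and PDF callables on (0, 1) can be passed in the same way.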

Some distributional properties of BEW distribution
In this section some fundamental distributional properties of the BEW distribution are presented: the quantile function, median, interquartile range, moments, moment generating function (mgf), mean, variance, and standard deviation.
The quantile function (QF) is the inverse of the CDF. The QF of the BEW distribution is given in (12). The median and interquartile range (IQR) of the BEW distribution are calculated from (12) as Median = $y_{0.5}$ and IQR = $y_{0.75} - y_{0.25}$.
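As an illustration of how a quantile function yields the median and IQR, the sketch below inverts an assumed BEW CDF, $F(y) = 1 - [1 - e^{-(\lambda(-\ln y))^{\beta}}]^{\alpha}$. This parameterisation, the name `lam` for the third parameter, and the function names are assumptions made for the example, not the paper's Eq. (12).

```python
import math


def bew_cdf(y, alpha, beta, lam):
    """Assumed BEW CDF for 0 < y <= 1."""
    t = -math.log(y)
    return 1.0 - (1.0 - math.exp(-((lam * t) ** beta))) ** alpha


def bew_quantile(u, alpha, beta, lam):
    """Solve F(y) = u for y under the assumed CDF."""
    if not 0.0 < u < 1.0:
        raise ValueError("u must lie strictly between 0 and 1")
    inner = 1.0 - (1.0 - u) ** (1.0 / alpha)
    t = (-math.log(inner)) ** (1.0 / beta) / lam
    return math.exp(-t)


def median_and_iqr(alpha, beta, lam):
    """Median y_0.5 and interquartile range y_0.75 - y_0.25."""
    q25, q50, q75 = (bew_quantile(u, alpha, beta, lam) for u in (0.25, 0.5, 0.75))
    return q50, q75 - q25


print(median_and_iqr(1.7, 0.5, 2.8))
print(bew_cdf(bew_quantile(0.3, 1.7, 0.5, 2.8), 1.7, 0.5, 2.8))  # round trip, close to 0.3
```

The same function supports inverse-transform sampling: feeding uniform random numbers into `bew_quantile` produces draws from the assumed distribution.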
The rth moments of the BEW distribution are then defined, where p = 1 − q.
Theorem 1. An rth incomplete central moment of the BEW distribution is given below.
The integral appearing in this expression is known as the incomplete beta function.
Proof. Let the random variable Y follow the PDF given in Eq. (4); then the incomplete moments are obtained as follows.
Let z = ϒ β; simplifying, we get the following.
So, the above expression becomes the expression given in Eq. (17).
Theorem 2. The Lorenz curve $L_F(y)$ for the BEW distribution is defined as

Generalization of proposed methodology: a bivariate version of a BEW distribution
Researchers often need to assess the relationship between two numerical variables in a data set, such as the correlation between an individual's age and BMI. Bivariate distributions are a valuable tool for examining whether variables are independent and for evaluating the dependability of products, particularly in insurance risk analysis, economics, and waiting-time analysis. In this section, an extended form of the BEW distribution, the bivariate bounded exponentiated Weibull distribution (B-BEWD), is presented. An illustration of the CDF and PDF of the B-BEWD is provided below.
Figure 3 shows the PDF plots for the given parameter values.

Parameter estimation methods
Six different methods for estimating the parameters are covered in this section: maximum likelihood estimation (MLE), Cramér-von Mises estimation (CVME), ordinary least squares estimation (OLSE), weighted least squares estimation (WLSE), percentile estimation (PC), and Anderson-Darling estimation (ADE).

Maximum Likelihood Estimation
To estimate the parameters of the BEW distribution, take the derivative of Eq. (22) with respect to α, β, and the location parameter in turn. Because the resulting non-linear system of equations has no closed-form solution, it is solved numerically to obtain the parameter estimates.
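The numerical step can be sketched as minimising a negative log-likelihood. The example below is an illustration under stated assumptions, not the paper's Eq. (22): it assumes the density obtained from $G(x) = [1 - e^{-(\lambda x)^{\beta}}]^{\alpha}$ under $Y = e^{-X}$, uses the hypothetical name `lam` for the third parameter, and simulates its own data. It only evaluates the objective at two parameter vectors; in practice `neg_log_lik` would be passed to a general-purpose numerical optimiser.

```python
import math
import random


def bew_logpdf(y, alpha, beta, lam):
    """Log-density of Y = exp(-X) under the assumed exponentiated Weibull form."""
    t = -math.log(y)
    z = (lam * t) ** beta
    return (math.log(alpha) + math.log(beta) + beta * math.log(lam)
            + (beta - 1.0) * math.log(t) - z
            + (alpha - 1.0) * math.log(-math.expm1(-z))
            - math.log(y))


def neg_log_lik(params, data):
    """Negative log-likelihood; returns +inf outside the parameter space."""
    alpha, beta, lam = params
    if min(alpha, beta, lam) <= 0.0:
        return math.inf
    return -sum(bew_logpdf(y, alpha, beta, lam) for y in data)


def bew_sample(n, alpha, beta, lam, rng):
    """Inverse-transform sampling from the assumed distribution."""
    out = []
    for _ in range(n):
        u = rng.uniform(1e-12, 1.0 - 1e-12)
        inner = 1.0 - (1.0 - u) ** (1.0 / alpha)
        out.append(math.exp(-((-math.log(inner)) ** (1.0 / beta)) / lam))
    return out


rng = random.Random(2024)
data = bew_sample(500, 1.0, 2.0, 3.0, rng)
print(neg_log_lik((1.0, 2.0, 3.0), data))  # at the generating parameters
print(neg_log_lik((3.0, 0.7, 1.0), data))  # at a distant parameter vector
```

The objective is smaller at the generating parameters than at the distant vector, which is the behaviour a numerical optimiser exploits.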

Ordinary and Weighted Least Squares Estimation Methods
Let $Y_1, Y_2, \ldots, Y_n$ be the ordered values from the BEW distribution with distribution function F(Y). For a sample of size n, we have $E[F(Y_{(i)})] = i/(n+1)$. The least-squares estimators of α, β, and the location parameter of the BEW distribution are obtained by minimizing the following function.
In the case of the BEW distribution, Eq. (23) becomes the following.
Take the partial derivatives of (24) with respect to the parameters to determine the estimates of α, β, and the location parameter. By simplifying Eq. (25), the WLS estimates $\hat{\alpha}_{WLS}$, $\hat{\beta}_{WLS}$, and the corresponding estimate of the location parameter can be obtained by minimizing the following function.
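The two least-squares criteria can be sketched generically. The code below assumes the usual forms: the ordinary criterion sums $(F(y_{(i)}) - i/(n+1))^2$, and the weighted criterion multiplies each term by $(n+1)^2(n+2)/(i(n-i+1))$, the reciprocal variance of $F(Y_{(i)})$. A one-parameter power-function CDF $F(y) = y^{a}$ is used as a stand-in model, and the minimisation is a plain grid search; the names are hypothetical.

```python
def ls_objective(sorted_y, cdf, weighted=False):
    """Ordinary or weighted least-squares distance between F(y_(i)) and i/(n+1)."""
    n = len(sorted_y)
    total = 0.0
    for i, y in enumerate(sorted_y, start=1):
        target = i / (n + 1)  # E[F(Y_(i))]
        weight = ((n + 1) ** 2 * (n + 2)) / (i * (n - i + 1)) if weighted else 1.0
        total += weight * (cdf(y) - target) ** 2
    return total


# Stand-in model: power-function CDF F(y) = y**a on (0, 1).
def power_cdf(y, a):
    return y ** a


# Idealised ordered sample placed exactly at the plotting positions for a = 2.
n, a_true = 25, 2.0
sample = [(i / (n + 1)) ** (1.0 / a_true) for i in range(1, n + 1)]

grid = [k / 10 for k in range(5, 41)]
a_ols = min(grid, key=lambda a: ls_objective(sample, lambda y: power_cdf(y, a)))
a_wls = min(grid, key=lambda a: ls_objective(sample, lambda y: power_cdf(y, a), weighted=True))
print(a_ols, a_wls)
```

For a three-parameter model such as the BEW distribution, the grid search would be replaced by a multivariate numerical optimiser.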

Cramér-Von Mises estimation
Let $Y_1, Y_2, \ldots, Y_n$ be the ordered values arising from the BEW distribution. The Cramér-von Mises estimates $\hat{\alpha}_{CVM}$, $\hat{\beta}_{CVM}$, and the corresponding estimate of the location parameter are found by minimizing the function given below.
Differentiating Eq. (30) with respect to α, β, and the location parameter gives the equations below, from which the parameter estimates can be determined numerically.
where the terms in $y_{(i)}$ given the parameters are defined in the section "Ordinary and Weighted Least Squares Estimation Methods".

Anderson-Darling estimation
Let $Y_1, Y_2, \ldots, Y_n$ be ordered observations arising from the BEW distribution. The Anderson-Darling estimates of the parameters are determined by minimizing the function given below.
These estimators can be derived by solving the non-linear equations given below.
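Both minimum-distance criteria, the Cramér-von Mises function of the previous section and the Anderson-Darling function, can be written as short objective functions. The sketch assumes their standard forms, $C = \frac{1}{12n} + \sum_i \left(F(y_{(i)}) - \frac{2i-1}{2n}\right)^2$ and $A = -n - \frac{1}{n}\sum_i (2i-1)\left[\ln F(y_{(i)}) + \ln\left(1 - F(y_{(n+1-i)})\right)\right]$, and again uses a power-function CDF as a stand-in model with hypothetical names.

```python
import math


def cvm_objective(sorted_y, cdf):
    """Cramer-von Mises distance for an ordered sample."""
    n = len(sorted_y)
    return 1.0 / (12 * n) + sum(
        (cdf(y) - (2 * i - 1) / (2 * n)) ** 2
        for i, y in enumerate(sorted_y, start=1)
    )


def ad_objective(sorted_y, cdf):
    """Anderson-Darling distance for an ordered sample."""
    n = len(sorted_y)
    total = 0.0
    for i in range(1, n + 1):
        f_low = cdf(sorted_y[i - 1])   # F(y_(i))
        f_high = cdf(sorted_y[n - i])  # F(y_(n+1-i))
        total += (2 * i - 1) * (math.log(f_low) + math.log(1.0 - f_high))
    return -n - total / n


# Stand-in model: power-function CDF F(y) = y**a on (0, 1).
def power_cdf(y, a):
    return y ** a


n, a_true = 20, 2.0
sample = [((2 * i - 1) / (2 * n)) ** (1.0 / a_true) for i in range(1, n + 1)]
for a in (0.5, 2.0, 4.0):
    model = lambda y, a=a: power_cdf(y, a)
    print(a, cvm_objective(sample, model), ad_objective(sample, model))
```

Both criteria are smallest at the generating value a = 2 for this idealised sample; the Anderson-Darling criterion puts more weight on the tails because of the logarithms.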

Percentile estimation
Let $Y_1, Y_2, \ldots, Y_n$ be ordered observations from the BEW distribution, and let $u_i = i/(n+1)$ be an unbiased estimate of the CDF $F_Y$ evaluated at $y_{(i)}$. The PC estimates of the BEW distribution parameters are derived by minimizing the following function:
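Percentile estimation compares the ordered observations with model quantiles rather than comparing CDF values with plotting positions. A minimal sketch, assuming the usual criterion $\sum_i \left(y_{(i)} - Q(u_i)\right)^2$ with $u_i = i/(n+1)$, and using the quantile function of a power-function model $Q(u) = u^{1/a}$ as a stand-in (names are hypothetical):

```python
def pc_objective(sorted_y, quantile):
    """Sum of squared gaps between ordered data and model quantiles at u_i = i/(n+1)."""
    n = len(sorted_y)
    return sum(
        (y - quantile(i / (n + 1))) ** 2
        for i, y in enumerate(sorted_y, start=1)
    )


# Stand-in model: quantile function of F(y) = y**a on (0, 1).
def power_quantile(u, a):
    return u ** (1.0 / a)


n, a_true = 30, 2.0
sample = [power_quantile(i / (n + 1), a_true) for i in range(1, n + 1)]

grid = [k / 10 for k in range(5, 41)]
a_pc = min(grid, key=lambda a: pc_objective(sample, lambda u: power_quantile(u, a)))
print(a_pc)
```

This approach is convenient whenever the quantile function is available in closed form, as it is stated to be for the BEW distribution.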

Simulation study
In this section a simulation study based on the BEW distribution is presented to assess the performance of the estimators discussed in the previous section, and numerical results are obtained. We generate N = 10,000 samples of sizes n = (20, 40, 100, 300) from the BEW distribution with two parameter settings: (α = 1, β = 2, location parameter = 3) and (α = 1.7, β = 0.5, location parameter = 2.8). The random numbers are generated with the quantile function of the BEW distribution. In this simulation study, we calculate the empirical mean, bias, and mean square error (MSE) of all estimators in order to compare their biases and MSEs across sample sizes. Tables 2 and 3 report the simulation results in terms of bias, average bias, MSE, and mean relative error (MRE) for small, medium, and large sample sizes.
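The structure of such a Monte Carlo study can be sketched in a few lines. The example below is a toy illustration only: instead of the BEW distribution it uses the one-parameter model $F(y) = y^{a}$, whose MLE has the closed form $\hat{a} = -n / \sum_i \ln y_i$, and it uses far fewer replicates than the study. The empirical mean, bias, and MSE are computed as the average estimate, its difference from the true value, and the average squared error; all names are hypothetical.

```python
import math
import random


def summarize(estimates, true_value):
    """Empirical mean, bias, and mean square error of a list of estimates."""
    n = len(estimates)
    mean = sum(estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return {"mean": mean, "bias": mean - true_value, "mse": mse}


def simulate(n, a_true, replicates, rng):
    """Toy model F(y) = y**a: sample by inversion, estimate a by its closed-form MLE."""
    estimates = []
    for _ in range(replicates):
        sample = [rng.uniform(1e-12, 1.0) ** (1.0 / a_true) for _ in range(n)]
        estimates.append(-n / sum(math.log(y) for y in sample))
    return summarize(estimates, a_true)


rng = random.Random(1)
for n in (20, 40, 100, 300):
    print(n, simulate(n, a_true=2.0, replicates=1000, rng=rng))
```

Replacing the sampler and the estimator with their BEW counterparts gives the same loop for each of the estimation methods being compared.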

Applications
This section demonstrates the significance of the BEW distribution and compares its performance with other competing unit-interval distributions. The data sets consist of unit-interval observations: the mortality rate of COVID-19 patients (1) in Canada and (2) in the UK, and the recovery rate (3) in Spain.
Table 4 reports descriptive measures for the COVID-19 mortality and recovery rates in the three countries. The UK data are positively skewed, while the Canada and Spain data are negatively skewed; the box plots in Fig. 4 show a similar trend.

Mortality rate of COVID-19 in UK
The BEW distribution offers the best fit to the data, with the smallest AIC, BIC, CAID, and AD values and the highest log-likelihood value and KS-test p-value.

COVID-19 mortality rate in Canada
The BEW distribution offers the best fit to the data, with the smallest AIC, BIC, CAID, and AD values and the highest log-likelihood value and KS-test p-value.

COVID-19 recovery rate in Spain
The BEW distribution offers the best fit to the data, with the smallest AIC, BIC, CAID, and AD values and the highest log-likelihood value and KS-test p-value.

BEW quantile regression model
In this section a BEW quantile regression model is developed using the quantile function of the BEW distribution. When the response variable is bounded in the unit interval, beta regression models, which describe the conditional mean response, become difficult to apply. In this study, the quantiles of the response are modeled with a quantile regression model. Considering the quantile function of the BEW distribution, we develop the PDF and CDF of the newly developed distribution, which are given below.
Here ω is the quantile parameter. The BEW quantile is expressed in terms of $z_i' = (1, z_{i1}, z_{i2}, \ldots, z_{ip})$, the ith covariate vector, and $\theta = (\theta_0, \theta_1, \ldots, \theta_p)'$, the vector of unknown parameters. The quantile, which lies in [0, 1], is linked to the covariates through the logit link function, and the resulting $\omega_i$ is substituted into Eq. (36). The log-likelihood for estimating the parameters of the bounded exponentiated Weibull quantile regression model (BEWQRM) is then formed, with $z_i$ as defined above. The parameters of the regression equation are estimated by maximizing the log-likelihood (LL) function. The estimates of α and θ are written as $\hat{\alpha}$ and $\hat{\theta}$ respectively.
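The logit link step can be made concrete. With a logit link, the linear predictor $z_i'\theta$ is mapped to the quantile by the inverse logit, $\omega_i = e^{z_i'\theta} / (1 + e^{z_i'\theta})$. The sketch below computes this mapping in a numerically stable way; the function name and the covariate and coefficient values are hypothetical.

```python
import math


def quantile_from_covariates(z, theta):
    """Inverse logit of the linear predictor z'theta (mathematically in (0, 1))."""
    eta = sum(zj * tj for zj, tj in zip(z, theta))
    if eta >= 0.0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)  # avoids overflow for large negative eta
    return e / (1.0 + e)


# Hypothetical covariate vector (leading 1 for the intercept) and coefficients.
z_i = (1.0, 0.2, -1.5)
theta = (0.5, 1.0, 0.3)
print(quantile_from_covariates(z_i, theta))
```

The value returned for each observation would then replace ω in the density before the log-likelihood is summed over the sample.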
The survival function and the hazard function of the BEWQRM are also obtained. Figure 8 presents the density plot of the BEWQRM for various parameter values. The PDF shows a variety of shapes, such as slightly and extremely positively skewed, negatively skewed, and symmetric. Figure 9 shows the hazard rate shapes of the BEWQRM, which exhibit J and reverse-J shapes.

Conclusion
In various real-life situations, random variables are bounded with support [0,1], and the data sets exhibit a variety of shapes. When modeling any data set, the selection of a probability distribution is always a complex matter. In this article a new unit-interval distribution, the bounded exponentiated Weibull (BEW) distribution, is proposed to model data sets with support [0,1]. Although various unit-interval distributions have been developed recently, no single distribution suits all types of data sets, and the Weibull distribution has always attracted researchers because of its wide range of applications. The proposed distribution takes a variety of shapes: positively and negatively skewed, symmetrical, U-shaped, and reversed-J-shaped. The hazard rate plot of the BEW distribution shows a bathtub shape. Various characteristics of the BEW distribution, including the CDF, QF, median, moments, inequality measures, and reliability measures, have been derived. Six different techniques have been investigated for estimating the parameters of the BEW distribution.
A simulation has been conducted to show the performance of the estimators. The BEW distribution has been applied to three data sets: COVID-19 death and recovery rates from the UK, Canada, and Spain. The proposed distribution outperforms the other competing unit-interval distributions. A bivariate extension of the BEW distribution has been developed, and its graphical shapes have been shown. A BEW quantile regression model has also been developed to examine the association between covariates and the conditional quantiles of a unit-interval response variable.

Figure 2 shows the CDF graphs for the given parameter values.

In this section the parameters of the BEW distribution are estimated by MLE. Let $Y_1, Y_2, \ldots, Y_n$ be a random sample of size n from the BEW distribution, with observed values $y_1, y_2, \ldots, y_n$. The likelihood function (L) is formed from these values, and taking logarithms gives the log-likelihood function l = l(ϑ), where ϑ collects α, β, and the location parameter.

Figure 2. CDF plots of the B-BEW distribution.

Figure 3. PDF plots of the B-BEW distribution.

Scientific Reports | (2024) 14:14353 | https://doi.org/10.1038/s41598-024-65057-6

The model selection tables contain the fitted distributions along with the values of the test statistics and their p-values, as well as the MLEs of the parameters with their standard errors. Figures 5, 6 and 7 compare the empirical and fitted PDFs and CDFs for the three data sets.

Figure 5. Fitted and empirical PDFs and CDFs of the UK dataset.

Figure 6. Empirical and fitted PDFs and CDFs of the Canada dataset.

Figure 7. Fitted and empirical PDFs and CDFs of the Spain dataset.

Figure 8. PDF plot of BEWQRM for some parametric and quantile values.

Table 3 .
The proposed estimation methods used are the maximum likelihood estimator (MLE), Anderson-Darling (AD), Cramér-von Mises (CVM), ordinary least squares (OLS), and weighted least squares (WLS). For large (n = 300) and medium (n = 100) sample sizes, MLE performs better than AD, CVM, OLS, and WLS. For small sample sizes (n = 20 and n = 50), AD is better than MLE, CVM, OLS, and WLS, and MLE is better than CVM, OLS, and WLS. R studio17 and Wolfram MATHEMATICA 13.3 were used for the numerical solutions, the simulations of the estimation methods (MLE, AD, CVM, OLS, WLS), and the further analysis and applications.

Table 4. Descriptive information for the mortality rate and recovery rate for COVID-19 patients in the stated countries.

Table 5. Model selection criteria and parameter estimates for the UK.

Table 6. Model selection criteria and parameter estimates for Canada.

Table 7. Model selection criteria and parameter estimates for Spain.